
View Home Page | View My GitHub Profile

<font face = "courier"> <font color = "green">

Edie Espejo


2017-01-29


Notes

</font>

  • Collaborators: Colin Santos, Chad Pickering

  • Sources: Safari Books Online, NumPy Reference Guide, Crash Course on NumPy for Images

  </font>

    Assignment 2

    Part 1: Image Processing Basics

    Computers use tiny dots called pixels to display images. Each pixel is stored as an array of numbers that represent color intensities.

    Example. In an 8-bit grayscale image, each pixel is a single number. The number represents light intensity ranging from black (0) to white (255).

    Example. In a 24-bit RGB color image, each pixel is an array of 3 numbers. These numbers range from 0 to 255 and represent red, green, and blue intensity, respectively. For instance, (0, 0, 255) is bright blue and (255, 128, 0) is orange.
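    To make this encoding concrete, here is a minimal sketch (using the example colors above) that builds a tiny 1 × 2 pixel RGB image as a NumPy array and hands it to PIL:

```python
import numpy as np
from PIL import Image

# A 1x2 image: one bright blue pixel and one orange pixel,
# stored as unsigned 8-bit integers (0-255 per channel).
pixels = np.array([[[0, 0, 255], [255, 128, 0]]], dtype=np.uint8)

# PIL interprets the (rows, columns, 3) array as an RGB image.
tiny_img = Image.fromarray(pixels)  # size is (width, height) = (2, 1)
```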

    In this assignment, you'll use Python and NumPy to manipulate 24-bit RGB color images.

    You can use Image.open() from the Python imaging library (PIL) to open an image:

    In [8]:
    import os
    
    from PIL import Image, ImageFile
    
    # Allow PIL to load images whose data is truncated.
    ImageFile.LOAD_TRUNCATED_IMAGES = True
    
    # Cat image from https://unsplash.com/photos/FqkBXo2Nkq0
    cat_img = Image.open("sharpnack.jpg")
    

    Images display inline in Jupyter notebooks:

    In [5]:
    # os.chdir("C:/Users/ehze/Dropbox/sta141b/assignment2")
    
    In [6]:
    os.getcwd()
    
    Out[6]:
    '/Users/EDIE/Dropbox/sta141b/assignment2'
    In [7]:
    cat_img
    
    Out[7]:

    In a Python terminal, you can display the image in a new window with .show() instead.

    NumPy can convert images to arrays:

    In [4]:
    import numpy as np
    
    cat = np.array(cat_img)
    

    To convert an array back to an image (for display) use the function below:

    In [5]:
    def as_image(x):
        """Convert an ndarray to an Image.
        
        Args:
            x (ndarray): The array of pixels.
            
        Returns:
            Image: The Image object.
        """
        return Image.fromarray(np.uint8(x))
    
    
    # Trying out the function
    as_image(cat)
    
    Out[5]:

    Exercise 1.1. How many dimensions does the cat array have? What does each dimension represent?

    <font face = "courier"> <font color = "green">

    Exercise 1.1

    </font>

    The cat array has 3 dimensions; its shape is (2048, 1365, 3). The first dimension indexes the rows of the image (2048 vertical pixels), the second indexes the columns (1365 horizontal pixels), and the third holds the 3 color intensities of each pixel: red, green, and blue. </font>

    In [6]:
    print(cat.shape)       # (2048, 1365, 3)
    print(cat[0].shape)    # (1365, 3)
    print(cat[0][0].shape) # (3,)
    
    (2048, 1365, 3)
    (1365, 3)
    (3,)
    

    Exercise 1.2. Use .copy() to copy the cat array to a new variable. Swap the green and blue color channels in the copy. Display the result.

    In [7]:
    # Exercise 1.2
    
    cat2 = cat.copy()
    
    # Swap the green and blue channels: RGB -> RBG
    cat2[:, :, [1, 2]] = cat2[:, :, [2, 1]]
    
    as_image(cat2)
    
    Out[7]:

    Exercise 1.3. Why is .copy() necessary in exercise 1.2? What happens if you don't use .copy()?

    <font face = "courier"> <font color = "green">

    Exercise 1.3

    </font>

    The use of .copy() is necessary because assigning an array to a new name does not copy its data; both names refer to the same underlying array. Without .copy(), swapping the channels of cat2 in place would also change cat, and the original image would be lost. </font>
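    A minimal sketch of this point: in-place changes through a second name show through the original, while `.copy()` gives independent data.

```python
import numpy as np

a = np.array([1, 2, 3])
alias = a          # same underlying data, just a new name
alias[0] = 99      # a is now [99, 2, 3] as well

b = np.array([1, 2, 3])
independent = b.copy()  # fresh copy of the data
independent[0] = 99     # b is unchanged
```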

    Exercise 1.4. Flip the blue color channel from left to right. Display the resulting image. Hint: see the NumPy documentation on array manipulation routines.

    In [16]:
    # Exercise 1.4
    
    cat3 = cat.copy()
    cat3[:,:,2] = np.fliplr(cat[:,:,2])
    as_image(cat3)
    
    Out[16]:

    Part 2: Singular Value Decomposition

    Suppose $X$ is an $n \times p$ matrix (for instance, one color channel of the cat image). The singular value decomposition (SVD) factors $X$ as $X = UD V^T$, where:

    • $U$ is an $n \times n$ orthogonal matrix
    • $D$ is an $n \times p$ matrix with zeroes everywhere except the diagonal
    • $V$ is a $p \times p$ orthogonal matrix

    Note that a matrix $A$ is orthogonal when $A^T A = I$ and $AA^T = I$.

    Example. We can use NumPy to compute the SVD for a matrix:

    In [9]:
    x = np.array(
        [[0, 2, 3],
         [3, 2, 1]]
    )
    u, d, vt = np.linalg.svd(x)
    # Here d is 2x2 because NumPy only returns the diagonal of D.
    print("u is:\n%s\nd is:\n%s\nv^T is:\n%s" % (u, d, vt))
    
    u is:
    [[-0.68145174 -0.73186305]
     [-0.73186305  0.68145174]] 
    d is:
    [ 4.52966162  2.54600974] 
    v^T is:
    [[-0.48471372 -0.62402665 -0.6128975 ]
     [ 0.80296442 -0.03960025 -0.59470998]
     [ 0.34684399 -0.78039897  0.52026598]]
    
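    As a quick sanity check (not part of the assignment), we can verify on the same matrix that $U$ and $V$ are orthogonal and that the factors multiply back to $X$. Note that `np.linalg.svd` returns only the diagonal of $D$, so we rebuild the full $2 \times 3$ matrix first:

```python
import numpy as np

x = np.array([[0, 2, 3],
              [3, 2, 1]])
u, d, vt = np.linalg.svd(x)

# U and V are orthogonal: A A^T = I (up to floating-point error).
assert np.allclose(np.dot(u, u.T), np.eye(2))
assert np.allclose(np.dot(vt, vt.T), np.eye(3))

# Rebuild the full 2x3 D from its diagonal and recover X = U D V^T.
D = np.zeros((2, 3))
D[:2, :2] = np.diag(d)
assert np.allclose(np.dot(np.dot(u, D), vt), x)
```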

    If we let

    • $u_i$ denote the $i$th column of $U$
    • $d_i$ denote the $i$th diagonal element of $D$
    • $v_i$ denote the $i$th column of $V$

    then we can write the SVD as $\ X = UDV^T = d_1 u_1 v_1^T + \ldots + d_m u_m v_m^T\ $ using the rules of matrix multiplication. In other words, the SVD decomposes $X$ into a sum!

    If we eliminate some of the terms in the sum, we get a simple approximation for $X$. For instance, we could eliminate all but the first 3 terms to get the approximation $X \approx d_1 u_1 v_1^T + d_2 u_2 v_2^T + d_3 u_3 v_3^T$. This is the same as if we:

    • Zero all but the first 3 diagonal elements of $D$ to get $D_3$, then compute $X \approx UD_3V^T$
    • Eliminate all but the first 3 columns of $V$ to get the $p \times 3$ matrix $V_3$, then compute $X \approx X V_3 V_3^T$

    We always eliminate terms starting from the end rather than the beginning, because these terms contribute the least to $X$.
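    To see why dropping the last terms loses the least, here is a small sketch on the example matrix above, comparing a rank-1 approximation built from the first (largest) singular value against one built from the last:

```python
import numpy as np

x = np.array([[0., 2., 3.],
              [3., 2., 1.]])
u, d, vt = np.linalg.svd(x)

def rank1_term(i):
    """The rank-1 term d_i * u_i * v_i^T of the SVD sum."""
    return d[i] * np.outer(u[:, i], vt[i, :])

err_keep_first = np.linalg.norm(x - rank1_term(0))  # drop the small term
err_keep_last = np.linalg.norm(x - rank1_term(1))   # drop the large term

# Keeping the largest singular value gives the smaller error;
# in fact the error of each rank-1 approximation equals the
# singular value that was dropped.
assert err_keep_first < err_keep_last
```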

    Why would we want to approximate a matrix $X$?

    In statistics, principal components analysis uses this approximation to reduce the dimension (number of covariates) in a centered (mean 0) data set. The vectors $d_i u_i$ are called the principal components of $X$. The vectors $v_i^T$ are called the basis vectors. Note that both depend on $X$. The dimension is reduced by using the first $q$ principal components instead of the original $p$ covariates. In other words, the $n \times p$ data $X$ is replaced by the $n \times q$ data $UD_q = XV_q$.

    In computing, this approximation is sometimes used to reduce the number of bits needed to store a matrix (or image). If $q$ terms are kept, then only $nq + pq$ values (for $XV_q$ and $V_q^T$) need to be stored instead of the uncompressed $np$ values.
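    As a rough sketch of that count, using the dimensions of the cat's red channel (2048 × 1365) with $q = 10$ terms kept. Note this counts stored numbers, not bytes; the byte savings measured later in Exercise 2.2 are smaller, because the SVD factors are 8-byte floats while the original image uses 1-byte integers.

```python
n, p, q = 2048, 1365, 10    # red channel dimensions, terms kept

uncompressed = n * p        # values in X
compressed = n * q + p * q  # values in X V_q plus V_q^T

# The compressed form stores only about 1.2% as many values.
print(compressed / float(uncompressed))
```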

    Exercise 2.1. Write the functions described below.

    • A function that takes a matrix $X$ and returns its principal component matrix $XV_q$ and basis matrix $V_q^T$. This function should also take the number of terms kept $q$ as an argument.

    • A function that takes a principal component matrix $XV_q$ and basis matrix $V_q^T$ and returns an approximation $\hat{X}$ for the original matrix.

    As usual, make sure to document your functions. Test your function on the red color channel of the cat image. What's the smallest number of terms where the cat is still recognizable as a cat?

    <font face = "courier"> <font color = "green">

    Exercise 2.1

    </font>

    This is admittedly subjective, but the smallest number of terms at which I can still recognize the cat as a cat is 15. With fewer than 15 terms, the cat blends in too much with the grass. </font>

    In [12]:
    # First Function
    def pcBasisMatrix(matx, q):
        '''
        Take a matrix matx and return its principal component matrix (matx V_q)
        and its basis matrix (V_q^T), where q is the number of terms kept.
        '''
        u, d, vt = np.linalg.svd(matx)
        vq = np.transpose(vt)[:, :q]
        pcm = np.dot(matx, vq)
        bm = np.transpose(vq)
        return pcm, bm
    
    
    # Second Function
    def approxMatrix(pcm, bm):
        '''
        Take a principal component matrix and a basis matrix and return the
        approximation of the original matrix.
        '''
        return np.dot(pcm, bm)
    
    
    # Try on kitty's red channel
    def scaryCat(matx, q):
        '''Approximate matx keeping q terms of its SVD.'''
        pcm, bm = pcBasisMatrix(matx, q)
        return approxMatrix(pcm, bm)
    
    # Perhaps subjective, but I can tell kitty's a kitty at 15 terms; any fewer and I can't.
    as_image(scaryCat(cat[:, :, 0], 15))
    
    Out[12]:

    Exercise 2.2. You can check the number of bytes used by a NumPy array with the .nbytes attribute. How many bytes does the red color channel of the cat image use? How many bytes does the compressed version use when 10 terms are kept? What percentage of the original size is this?

    <font face = "courier"> <font color = "green">

    Exercise 2.2

    </font>

    The original red color channel of the cat image uses 2,795,520 bytes.
    The compressed version with 10 terms kept uses 273,040 bytes.
    The compressed version is about 9.77% the size of the original. </font>

    In [13]:
    # Exercise 2.2
    
    original_bytes = cat[:,:,0].nbytes
    print("Original bytes: "+str(original_bytes))
    
    pcm, bm = pcBasisMatrix(cat[:,:,0], 10)
    compressed_bytes = pcm.nbytes+bm.nbytes
    print("Compressed bytes: "+str(compressed_bytes))
    
    Original bytes: 2795520
    Compressed bytes: 273040
    
    In [14]:
    ratio = (pcm.nbytes + bm.nbytes) / float(cat[:, :, 0].nbytes)
    print("Percentage of Original Size: "+str(ratio*100)+"%")
    
    Percentage of Original Size: 9.76705586081%
    
    